Context and Subcategories for Sliding Window Object Recognition

Author

  • Santosh K. Divvala
Abstract

Object recognition is one of the fundamental challenges in computer vision, where the goal is to identify and localize the extent of object instances within an image. The current de facto standard for building high-performance object category detectors is the sliding window approach. This approach involves scanning an image with a fixed-size rectangular window and applying a classifier to the features extracted within the sub-image defined by the window. In this thesis, we study two important factors influencing the performance of the approach.

The first is the role played by context, where information outside the sliding window is used to rescore the detections output by the local window classifier. Context helps to suppress detections in regions that are less likely to contain an object and encourages those in regions that are more plausible. In the first part of this thesis, we enumerate different sources and uses of context, and comprehensively evaluate their role in a benchmark detection challenge. Our analysis demonstrates that carefully used contextual cues serve not only to improve the performance of local classifiers, but also to make their error patterns more meaningful and reasonable. Our analysis also provides a basis for assessing the inherent limitations of the existing approaches as well as the specific problems that remain unsolved.

The second factor is the role played by subcategories, where information within the sliding window is used to split the training data into smaller groups, for learning multiple classifiers to model the appearance of an object category. The smaller groups have reduced appearance diversity and thus lead to simpler classification problems. In the second part of this thesis, we analyze different schemes to generate subcategories and find that unsupervised feature-space clustering produces well-performing subcategory classifiers. Beyond performance gains, subcategories are attractive for their conceptual simplicity and computational tractability. For example, we find that careful use of subcategories can potentially replace the need for deformable parts within the state-of-the-art deformable parts model detector for many object categories. Data fragmentation is an important problem associated with subcategory-based methods. We present a novel approach that circumvents this problem by allowing different subcategories to share each other’s training instances.
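The sliding window scan described in the abstract can be illustrated with a minimal sketch. It is not the detector studied in the thesis: the function names, the toy image, and the mean-intensity scorer standing in for a trained classifier are all assumptions made for illustration.

```python
from typing import Callable, List, Sequence, Tuple

Image = Sequence[Sequence[float]]   # row-major grid of pixel intensities
Box = Tuple[int, int, int, int]     # (top, left, height, width)


def mean_intensity(patch: List[List[float]]) -> float:
    """Hypothetical stand-in for a trained classifier: mean pixel intensity."""
    total = sum(sum(row) for row in patch)
    return total / (len(patch) * len(patch[0]))


def sliding_window_detect(
    image: Image,
    window: Tuple[int, int],
    stride: int,
    score_fn: Callable[[List[List[float]]], float],
    threshold: float,
) -> List[Tuple[Box, float]]:
    """Scan a fixed-size window over the image; keep windows scoring above threshold."""
    win_h, win_w = window
    img_h, img_w = len(image), len(image[0])
    detections = []
    for top in range(0, img_h - win_h + 1, stride):
        for left in range(0, img_w - win_w + 1, stride):
            # Extract the sub-image defined by the current window position.
            patch = [list(row[left:left + win_w]) for row in image[top:top + win_h]]
            score = score_fn(patch)
            if score > threshold:
                detections.append(((top, left, win_h, win_w), score))
    return detections


# Toy 4x4 image with a bright 2x2 block in the lower-right corner.
toy_image = [
    [0, 0, 0, 0],
    [0, 0, 0, 0],
    [0, 0, 1, 1],
    [0, 0, 1, 1],
]
detections = sliding_window_detect(toy_image, (2, 2), 1, mean_intensity, 0.9)
print(detections)
```

In a real detector the scorer would operate on features extracted from each window, and the scan would typically be repeated over several image scales. Both are omitted here to keep the sketch short.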

Similar resources

On the Interaction Between True Source, Training, and Testing Language Models

An interaction has been found between the true source language model, training language model, and the testing language model. This interaction has implications for vocabulary independent modeling, testing methodologies, discriminative training, and the adequacy of our current databases for continuous speech recognition (CSR) development. The current DARPA databases suffer from the described di...


Allophone-based acoustic modeling for Persian phoneme recognition

Phoneme recognition is one of the fundamental phases of automatic speech recognition. Coarticulation, which refers to the integration of sounds, is one of the important obstacles in phoneme recognition. In other words, each phone is influenced and changed by the characteristics of its neighbor phones, and coarticulation is responsible for most of these changes. The idea of modeling the effects o...


Improved Bayesian Training for Context-Dependent Modeling in Continuous Persian Speech Recognition

Context-dependent modeling is a widely used technique for better phone modeling in continuous speech recognition. While different types of context-dependent models have been used, triphones have been known as the most effective ones. In this paper, a Maximum a Posteriori (MAP) estimation approach has been used to estimate the parameters of the untied triphone model set used in data-driven clust...


Research priorities in medical education at Shiraz University of Medical Sciences: categories and subcategories in the Iranian context

Introduction: Research in education is a globally significant issue without a long history. Due to the importance of the issue in Health System Development programs, this study intended to determine research priorities in medical education, considering their details and functions. By determining barriers existing in research in education progress, it is tried to make research priorities more function...


Nurse manager’s recognition behavior with staff nurses in Japan-based on semi-structured interviews

Objective: The purpose of this qualitative study was to obtain a better understanding of nurse manager’s recognition behavior. Methods: This study, consisting of semi-structured interviews, was conducted in five hospitals with 100 beds or more in the Kanto, Kansai, and Kyushu regions of Japan. Fifteen nurse managers, who each had more than one year of professional work experience as a nurse man...


Publication date: 2012